Abstract

The gene encoding the spike (S) protein from two geographically distinct strains 
(American and British) of canine coronavirus (CCV) was cloned and sequenced. The 
nucleotide sequence revealed open reading frames encoding 1443 and 1453 amino acids,
respectively. Structural features include an N-terminal hydrophobic signal sequence, a hydrophilic
cysteine-rich cluster near the C-terminus, two heptad repeats and 29 or 33 potential 
N-glycosylation sites. Pairwise comparisons of S amino acid sequences from these isolates 
with other CCV strains (Insavcl and K378) revealed that heterogeneity, found mostly in the 
form of conservative substitutions, is distributed throughout the canine sequences. However, 
5 variable regions could be identified. Similar analysis with feline, porcine, murine, chicken 
and human coronavirus sequences revealed that the canine sequences are much more 
closely related to the feline S protein sequence than to the porcine S protein sequences 
even though they are all from the same antigenic group. Moreover, the sequence similarity
between the CCV isolates and the feline coronavirus, feline infectious peritonitis virus (FIPV),
was comparable to that among the CCV isolates themselves. Expression of the CCV or the
transmissible gastroenteritis virus (TGEV) S gene using the vaccinia virus system produced
a protein of the expected size which could induce extensive syncytium formation in infected
canine A72 cells.

Canine coronavirus (CCV) belongs to one of the major antigenic groups of
coronaviruses (Siddell et al., 1983; Spaan et al., 1988) and is serologically and
genetically related to feline infectious peritonitis virus (FIPV), feline enteric
coronavirus (FeCV), transmissible gastroenteritis virus (TGEV) and porcine respiratory
coronavirus (PRCV) (Sanchez et al., 1990; Horsburgh et al., 1992; Wesseling
et al., 1994). This close genetic relationship suggests that these viruses may share a
common ancestor (Horzinek et al., 1982).

The CCV virion is known to contain at least 4 protein species: the 50K 
nucleocapsid protein, N, the 32K integral membrane protein, M, the 9K small 
membrane protein, SM, and the spike glycoprotein, S (Garwes and Reynolds, 
1981; Godet et al., 1992; Horsburgh et al., 1992). The 160K S protein, which forms 
the projecting spike on the surface of the virion (Tyrrell et al., 1968; Spaan et al., 
1988), is synthesized on ribosomes bound to the rough endoplasmic reticulum 
(RER) where it is co-translationally glycosylated (Holmes et al., 1981; Niemann et 
al., 1982; Sturman and Holmes, 1985). After glycosylation, the majority of the S
proteins oligomerize into trimers (Cavanagh, 1983a,b; Delmas and
Laude, 1990) and are transported to the Golgi, where a large number become incorporated
into virions. The S glycoprotein mediates binding of virions to the host cell
receptor (Williams et al., 1991), is the major target for neutralizing antibodies
(Spaan et al., 1988), is recognized by T lymphocytes (Korner et al., 1991) and
induces polykaryocyte formation by cell fusion in cultured cells (Collins et al.,
1982; Sturman and Holmes, 1985).

Clearly, to understand the biological and pathogenic properties of CCV at the
molecular level, the primary structure of CCV S genes and data on their processing
are essential. Indeed, the S protein gene has been cloned for two strains of CCV
(Horsburgh et al., 1992; Wesseling et al., 1994). However, coronavirus genomes are
dynamic and subject to recombination, insertion and deletion, and deletions appear to
occur at a higher frequency within S. The polymorphism of S found in
MHV strains, which accounts for differences in their biological activities, is well
documented (Spaan et al., 1988; Parker et al., 1989; Gallagher et al., 1990). We
expected that cloning and sequencing of two geographically distinct isolates of
CCV, CCV-6 (American) and CCV-C54 (British), would help to build a consensus
picture of this virus within a region where deletions occur at a higher frequency
relative to background mutation.

Oligo(dT)-selected CCV genomic RNA from either CCV-6 or CCV-C54 was
used to generate cDNA libraries (Horsburgh et al., 1992) using oligonucleotides 10
and 1690 as primers (Fig. 1). It was hoped that using two cDNA libraries would
increase the likelihood of finding clones that covered the S gene. Inserts of 1 kb or
greater from recombinant clones were retained for further analysis. Sequencing
the ends of these clones permitted initial alignment of the CCV-6 and CCV-C54
clones with respect to the CCV-Insavcl genome (Horsburgh et al., 1992). This
approach proved fruitful in that 5 clones were identified which spanned the S
coding region of both CCV-6 and CCV-C54 (see Fig. 1). The relationships between
putative overlapping clones were confirmed by Southern hybridization. The nucleotide
sequence of the overlapping cDNA clones was determined using the
Sanger dideoxy chain termination method and analysed using the SAP programs of
Staden (1986). The deduced S amino acid sequences of CCV-6 and
CCV-C54 and the results of comparisons with other S protein sequences are
presented in Fig. 2 and summarized in Table 1.

Fig. 1. Alignment of CCV-6 and CCV-C54 cDNA clones with respect to the CCV-Insavcl genome
(Horsburgh et al., 1992) using partial sequence information. Overlaps were confirmed by Southern
blotting. Libraries of cDNA clones were generated from poly(A)-containing infected cell RNA using
primers 10 and 1690 (GACCTGTAATGACTCAT and TAGGTAGTAACACAACA, respectively) and
cloned into pUC118 as described by Horsburgh et al. (1992). Restriction enzyme sites used for cloning
are indicated: B, BamHI; A, AflII; H, HindIII (see text). The BamHI site was introduced by
site-directed mutagenesis using primer 17 (TAATCACTT GGATCC TTAATGTGCC) as described by
Brierley et al. (1987).

Thirty-two nucleotides (nts) upstream of the ATG start codon of either CCV-6 
or CCV-C54 is the sequence CTAAAC. This motif, which is believed to be the 
signal for transcription of subgenomic messenger RNA species (Spaan et al., 1988) 
is found upstream of nearly every coronavirus S gene sequenced so far (DeGroot 
et al., 1987; Luytjes et al., 1987; Schmidt et al., 1987; Britton and Page, 1990; 
Parker et al., 1990; Rasschaert et al., 1990; Britton et al., 1991; Horsburgh et al., 
1992; Mounir and Talbot, 1993; Wesseling et al., 1994). The exceptions are
IBV-Beaudette, HCV-229E and PEDV, in which the motif is CTGAAC, CTCAAC and
GTAAAC, respectively (Binns et al., 1985; Raabe et al., 1990; Duarte and Laude,
1994). It would appear that the consensus sequence XTXAAC is conserved
throughout this family of viruses; however, it is likely that this motif is only one
component of a larger transcription initiation signal, as the surrounding sequences
are quite different in these viruses. This may in part explain the different ratios of
subgenomic mRNAs found in infected cells.
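
The consensus described above can be checked mechanically. The following is a minimal sketch, assuming that X in XTXAAC stands for any of the four nucleotides; the function names are illustrative and the motif table simply restates the hexamers quoted in the text.

```python
import re

# Hexanucleotide motifs quoted in the text, keyed by virus.
MOTIFS = {
    "CCV-6": "CTAAAC",
    "CCV-C54": "CTAAAC",
    "IBV-Beaudette": "CTGAAC",
    "HCV-229E": "CTCAAC",
    "PEDV": "GTAAAC",
}

# Assumption: X in XTXAAC means "any of A, C, G or T".
CONSENSUS = re.compile(r"[ACGT]T[ACGT]AAC")


def matches_consensus(motif: str) -> bool:
    """True if a hexamer fits the XTXAAC consensus exactly."""
    return CONSENSUS.fullmatch(motif.upper()) is not None


def find_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, hexamer) for every consensus hit.

    A lookahead is used so that overlapping hits are also reported.
    """
    seq = sequence.upper()
    pattern = re.compile(r"(?=([ACGT]T[ACGT]AAC))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    for virus, motif in MOTIFS.items():
        print(virus, motif, matches_consensus(motif))
```

Because the consensus fixes only four of six positions, such a scan would be expected to return many hits unrelated to transcription in a genome-length sequence, which is consistent with the suggestion that the hexamer is only part of a larger signal.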

Analysis of the CCV S gene sequences revealed features common to all
coronavirus S proteins; namely, a signal sequence, a transmembrane anchor, the
hydrophobic profile, the presence of 2 heptad repeats, the C-terminal cysteine-rich
cluster, the KWPWY(W)VWL motif and a large number of potential N-glycosylation
sites (Fig. 2 and Table 1). The homology that these canine proteins share is
approximately 93% (Table 1), although the British isolates (Insavcl and C54) are more
closely related to each other than to the American isolates (6 and K378). The majority
of these sequence changes are conservative; however, there are 5 regions of
variability (changes of 4 or more residues; marked in Fig. 2). Interestingly,
there is some heterogeneity in the length of the canine S proteins (CCV-6,
1443 residues; CCV-C54, 1453 residues; Table 1). This is not unusual, as
the plasticity of the coronavirus genome, especially within and around the S gene,
is well documented (Keck et al., 1988; Kusters et al., 1989; Parker et al., 1990;
Horsburgh et al., 1992). Indeed, the S proteins from different isolates of
TGEV and especially MHV may differ in length by as much as 159 amino acids
(Rasschaert and Laude, 1987; Britton and Page, 1990; Parker et al., 1990).

Fig. 2. Alignment of the spike protein amino acid sequences of CCV-6, CCV-C54, CCV-Insavcl
(Horsburgh et al., 1992), CCV-K378 (Wesseling et al., 1994), FIPV (DeGroot et al., 1987), TGEV
(Rasschaert and Laude, 1987) and PRCV (Britton et al., 1991) using the CLUSTAL program of Higgins
and Sharp (1989). Spaces indicate positions at which the amino acid is identical to that of CCV-6 and
minus signs represent putative deletions. The presumed signal sequence (Heijne, 1986) is underlined and
the conserved motif, KWPWY(W)VWL, is shown in bold. Asterisks indicate regions where the canine
sequences vary by more than 3 residues.

Table 1
Summary of results

Coronavirus a   Transcription   Cleavage    N-glycosylation   Apoprotein   Length          Similarity c
                signal motif    sequence    sites b (no.)     (kDa)        (amino acids)   (%)

CCV-Insavcl     CTAAAC          n.a. d      30                160          1452            100
CCV-6           CTAAAC          n.a.        29                158          1443            92.7
CCV-C54         CTAAAC          n.a.        33                160          1453            94.7
CCV-K378        CTAAAC          n.a.        31                160          1453            93.3
FIPV 79-1146    CTAAAC          n.a.        35                160          1452            91.1
TGEV FS772      CTAAAC          n.a.        32                158          1447            78.5
PRCV 86         CTAAAC          n.a.        29                135          1225            74.5
HCV-229E        CTCAAC          n.a.        30                129          1173            39.3
BCV-Que         CTAAAC          RRSRR       20                150          1363            23.9
MHV-A59         CTAAAC          RRARR       20                146          1324            23.3
MHV-JHM         CTAAAC          RRARR       21                136          1360            22.1
IBV-Bea         CTGAAC          RRFRR       28                127          1162            22.1

a The strains are as described in the text.
b The potential number of sites.
c Similarity relative to CCV-Insavcl.
d n.a., not applicable.
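
The counts of potential N-glycosylation sites in Table 1 are of a kind that can be reproduced from a protein sequence by scanning for sequons. The sketch below assumes the conventional definition, Asn-X-Ser/Thr with X being any residue except proline; the source does not state the exact criterion used, and the short test peptide is hypothetical rather than taken from Fig. 2.

```python
import re

# Assumption: a potential site is the sequon N-X-S/T, where X is not proline.
# The lookahead allows overlapping sequons (e.g. N-N-S-S) to be counted.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein: str) -> list[int]:
    """Return 1-based positions of the Asn residue of each potential site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def count_sequons(protein: str) -> int:
    """Number of potential N-glycosylation sites in a protein sequence."""
    return len(sequon_positions(protein))


if __name__ == "__main__":
    toy_peptide = "MNGTANPSNNSS"  # hypothetical sequence, for illustration only
    print(sequon_positions(toy_peptide), count_sequons(toy_peptide))
```

These are potential sites only; whether a given sequon is actually glycosylated cannot be inferred from the sequence alone.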

Computer-aided comparison of the canine sequences with those of FIPV (DeGroot
et al., 1987), TGEV (Britton and Page, 1990), PRCV (Britton et al., 1991),
HCV-229E (Raabe et al., 1990), HCV-OC43 (Mounir and Talbot, 1993), BCV
(Parker et al., 1990), PEDV (Duarte and Laude, 1994), MHV-A59 (Luytjes et al.,
1987), MHV-JHM (Schmidt et al., 1987) and IBV-Beaudette (Binns et al., 1985)
revealed that the canine sequences are much more closely related to FIPV than to
the TGEV and PRCV sequences, even though they are all from the same antigenic
group (Table 1). This observation is reinforced by the finding that CCV and FIPV
have near identical subgenomic mRNA patterns in infected cells and that both
CCV and FIPV possess an extra ORF, 7b/6b, which is not present in TGEV or
PRCV (DeGroot et al., 1988; Horsburgh et al., 1992). Furthermore, it is interesting
to note that CCV-Insavcl is almost as closely related to FIPV 79-1146 (91.1% identity)
as it is to CCV-6 (92.7% identity), suggesting a very close evolutionary relationship
between the canine and feline coronaviruses.
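
Pairwise figures such as those quoted above are derived from aligned sequences. The following is a minimal sketch of one common way to compute percent identity from two rows of an alignment; it assumes that columns containing a gap in either sequence are excluded from the denominator, which may differ from the scoring used to produce Table 1. The example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Assumption: columns with a gap in either row are skipped, so the
    denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(round(percent_identity("MK-LV", "MKALI"), 1))
```

Counting gapped columns as mismatches instead would lower the value for sequences of different length, so the convention should be stated whenever such percentages are compared.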

In an attempt to characterize their biological properties, the canine S glycoproteins
were expressed using recombinant vaccinia viruses. Briefly, the full-length
coding region of the S gene of CCV-6 was reconstructed from two overlapping
cDNA clones, BH1 and BH2 (Fig. 1). The 3.0-kb insert from pBH1 contains
sequences from S and from the polymerase gene, 1b. In order to express S, the
polymerase coding sequence had to be removed. A BamHI site was introduced
immediately 5' of the initiating methionine codon by site-directed mutagenesis, as
described by Brierley et al. (1987). Mutants were screened by restriction enzyme
digestion and positive clones were sequenced across this site. A mutant which had the
introduced BamHI site was selected and designated pBH1-bam. This plasmid
overlapped pBH2 by approximately 300 bp, and a unique AflII site was located in this
region of overlap. The proximal S coding sequence was isolated from pBH1-bam as
a 1.5-kb AflII-SphI fragment and ligated into AflII-SphI digested pBH2, generating
pCCV6. The full-length S coding sequence was isolated as a 4.4-kb BamHI
fragment and then ligated into the BamHI site of the transfer vector pRK19 to form
pRKCCV6. pRK19, a gift from Dr. G.L. Smith, contains vaccinia virus TK flanking
sequences and the 4b late promoter. (A late promoter was used because the cis-acting
signal for termination of early transcription in vaccinia, (T)5NT, is found several
times within the S gene coding sequences.)

The C54 S gene coding sequence was
assembled from the 3 overlapping clones pBH3, pBH4 and pBH11 (Fig. 1). A
unique BamHI site was created 10 bp upstream of the peplomer ATG start codon
by site-directed mutagenesis in the proximal clone, pBH3, generating pBH3-bam.
A 2.0-kb AflII-EcoRI fragment was isolated from this plasmid and ligated to
AflII-EcoRI digested pBH4, forming pBH5'MS. This plasmid was cleaved with
HindIII, phosphatased and gel eluted. The 3' coding sequence was excised as a
1.1-kb HindIII fragment from pBH11, then ligated to the HindIII digested
pBH5'MS, generating pBHC54.

Plasmid TGEVS, which contains the TGEV strain
FS772/70 S gene coding sequence, was a gift from Dr. Paul Britton, AFRC
Compton. The gene was excised as a 4.5-kb BamHI fragment and subcloned into
the BamHI digested transfer vector pRK19 to form pRK19TG. Correct orientation
of the coronavirus S genes was confirmed by restriction enzyme digestion.

Recombinant vaccinia viruses were constructed by established procedures (Mackett and
Smith, 1986). Vaccinia virus infected HeLa cells were transfected with either
pRKCCV6, pBHC54 or pRK19TG and recombinant viruses were selected as
described (Mackett and Smith, 1986). Stocks were prepared after 3 rounds of plaque
purification. The recombinant viruses were called Vac4b-C6, Vac4b-C54 and
Vac4b-TG, respectively. The control recombinant vaccinia virus, Vac4b-gB,
expressing the CMV glycoprotein gB under the control of the 4b promoter, was a gift
from Dr. Helena Browne, University of Cambridge.
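
The choice of a late promoter rests on the presence of the vaccinia early transcription termination signal, (T)5NT, within the S coding sequence. A scan of this kind can be sketched as follows, assuming the signal is written as TTTTTNT on the coding (mRNA-sense) DNA strand with N being any nucleotide; the function name and the test sequences are hypothetical.

```python
import re

# Assumption: the early termination signal is TTTTTNT on the coding strand,
# with N = A, C, G or T. The lookahead reports overlapping occurrences too.
EARLY_TERMINATOR = re.compile(r"(?=(T{5}[ACGT]T))")


def early_terminator_sites(coding_dna: str) -> list[int]:
    """Return 0-based start positions of every TTTTTNT in a DNA sequence."""
    return [m.start() for m in EARLY_TERMINATOR.finditer(coding_dna.upper())]


if __name__ == "__main__":
    toy_sequence = "ACGTTTTTATGG"  # hypothetical, for illustration only
    print(early_terminator_sites(toy_sequence))
```

An insert returning one or more positions would be expected to give truncated transcripts from an early promoter, which is the rationale given in the text for using the 4b late promoter.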

In vitro characterization revealed that Vac4b-C6, Vac4b-C54 and Vac4b-TG all 
expressed a polypeptide which co-migrated with the CCV (Insavcl) S protein on 
SDS-PAGE gels (data not shown). As expected, sera from either FIPV infected 
cats or CCV infected dogs could immunoprecipitate the recombinant CCV and 
TGEV S proteins (Enjuanes et al., 1990; Sanchez et al., 1990). 

As one of the prominent biological features of coronavirus S proteins in
cultured cells is the induction of polykaryocyte formation, canine A72 cells were
infected with the recombinant vaccinia viruses or CCV, both at an m.o.i. of 0.05, or
mock infected. The infection was allowed to proceed for 24 h and the cells were
then fixed with 20% ethanol and stained with 0.1% crystal violet. Syncytia were
evident in Vac4b-C6, Vac4b-C54 (Fig. 3), Vac4b-TG and CCV infected cells (data
not shown). Multinucleated cells were not observed in Vac4b-gB and mock
infected cells (Fig. 3). Syncytia were not evident upon repetition of these experiments
in other cell types, e.g. BHK, RK, HeLa and TK- 143 cells. These data
show that the recombinant CCV-6, CCV-C54 and TGEV S proteins are expressed
on the cell surface and that they retain their biological activity.

Fig. 3. Biological activity of the recombinant S proteins. A72 cells, used for propagation of CCV, were
infected (m.o.i. of 0.01) with the recombinant vaccinia viruses Vac4b-C6 (A), Vac4b-C54 (B), Vac4b-gB
(C), or mock infected (D). Infected cells were stained with 0.1% crystal violet in 20% ethanol and
photographed under light microscopy. Arrows indicate multinucleated cells.

The S genes of FIPV, MHV, IBV and TGEV have been expressed using the
vaccinia virus system (Tomley et al., 1987; Pulford et al., 1990; Vennema et al.,
1990). Induction of cell-cell fusion by these recombinant viruses was observed but
was cell type-specific. In other words, the recombinant FIPV S protein could only
induce fusion in cells of feline origin; likewise, polykaryocytosis due to the
recombinant MHV S protein was restricted to murine cells (Daya et al., 1989;
Vennema et al., 1990). Similarly, the recombinant CCV-6 and CCV-C54 S proteins
could induce cell fusion in canine A72 cells. The observation that the recombinant
porcine S glycoprotein expressed from Vac4b-TG could induce polykaryocytosis in
canine A72 cells is not surprising, as TGEV, as well as FIPV and CCV, can
replicate in this cell type (S. Chalmers, personal communication). Therefore, it is
likely that S will induce syncytia only in cells that support replication of the 
parental coronavirus. Fusion may conceivably be mediated by the interaction of S 
with its respective coronavirus cell receptor as cell lines expressing the envelope 
glycoprotein of HIV will only induce syncytia in cells bearing the CD4 receptor 
(Sodroski et al., 1986). As expected, the CCV S protein, like its counterparts from
FIPV, FeCV, TGEV and PRCV, is probably not cleaved, as it lacks the basic amino
acid motif, RRXRR, at which the MHV and IBV S proteins are proteolytically cleaved.
Furthermore, cleavage products have not been detected by immunoprecipitation
in this work or in that reported by Wesseling et al. (1994).

In conclusion, this work has gone some way towards elucidating the
phylogenetic relationships within the coronavirus genus, and the reagents
produced will serve as useful tools in understanding its molecular pathogenesis.